{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pandas怎样实现Excel的vlookup并且在指定列后面输出？\n",
    "\n",
    "背景：  \n",
    "1. 有两个excel，他们有相同的一个列；  \n",
    "2. 按照这个列合并成一个大的excel，即vlookup功能，要求：  \n",
    "    * 只需要第二个excel的少量的列，比如从40个列中挑选2个列  \n",
    "    * 新增的来自第二个excel的列需要放到第一个excel指定的列后面；  \n",
    "3. 将结果输出到一个新的excel;\n",
    "\n",
    "微信公众号：蚂蚁学Python"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 步骤1：读取两个数据表"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>班级</th>\n",
       "      <th>学号</th>\n",
       "      <th>语文成绩</th>\n",
       "      <th>数学成绩</th>\n",
       "      <th>英语成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>C01</td>\n",
       "      <td>S001</td>\n",
       "      <td>99</td>\n",
       "      <td>84</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>C01</td>\n",
       "      <td>S002</td>\n",
       "      <td>66</td>\n",
       "      <td>95</td>\n",
       "      <td>77</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>C01</td>\n",
       "      <td>S003</td>\n",
       "      <td>68</td>\n",
       "      <td>68</td>\n",
       "      <td>61</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>C01</td>\n",
       "      <td>S004</td>\n",
       "      <td>63</td>\n",
       "      <td>66</td>\n",
       "      <td>82</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>C01</td>\n",
       "      <td>S005</td>\n",
       "      <td>72</td>\n",
       "      <td>95</td>\n",
       "      <td>94</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    班级    学号  语文成绩  数学成绩  英语成绩\n",
       "0  C01  S001    99    84    88\n",
       "1  C01  S002    66    95    77\n",
       "2  C01  S003    68    68    61\n",
       "3  C01  S004    63    66    82\n",
       "4  C01  S005    72    95    94"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 学生成绩表\n",
    "df_grade = pd.read_excel(\"./course_datas/c23_excel_vlookup/学生成绩表.xlsx\") \n",
    "df_grade.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学号</th>\n",
       "      <th>姓名</th>\n",
       "      <th>性别</th>\n",
       "      <th>年龄</th>\n",
       "      <th>籍贯</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>S001</td>\n",
       "      <td>怠涵</td>\n",
       "      <td>女</td>\n",
       "      <td>23</td>\n",
       "      <td>山东</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>S002</td>\n",
       "      <td>婉清</td>\n",
       "      <td>女</td>\n",
       "      <td>25</td>\n",
       "      <td>河南</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>S003</td>\n",
       "      <td>溪榕</td>\n",
       "      <td>女</td>\n",
       "      <td>23</td>\n",
       "      <td>湖北</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>S004</td>\n",
       "      <td>漠涓</td>\n",
       "      <td>女</td>\n",
       "      <td>19</td>\n",
       "      <td>陕西</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>S005</td>\n",
       "      <td>祈博</td>\n",
       "      <td>女</td>\n",
       "      <td>24</td>\n",
       "      <td>山东</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     学号  姓名 性别  年龄  籍贯\n",
       "0  S001  怠涵  女  23  山东\n",
       "1  S002  婉清  女  25  河南\n",
       "2  S003  溪榕  女  23  湖北\n",
       "3  S004  漠涓  女  19  陕西\n",
       "4  S005  祈博  女  24  山东"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 学生信息表\n",
    "df_sinfo = pd.read_excel(\"./course_datas/c23_excel_vlookup/学生信息表.xlsx\") \n",
    "df_sinfo.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***目标：怎样将第二个“学生信息表”的姓名、性别两列，添加到第一个表“学生成绩表”，并且放在第一个表的“学号”列后面？***"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 步骤2：实现两个表的关联\n",
    "\n",
    "即excel的vloopup功能"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学号</th>\n",
       "      <th>姓名</th>\n",
       "      <th>性别</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>S001</td>\n",
       "      <td>怠涵</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>S002</td>\n",
       "      <td>婉清</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>S003</td>\n",
       "      <td>溪榕</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>S004</td>\n",
       "      <td>漠涓</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>S005</td>\n",
       "      <td>祈博</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     学号  姓名 性别\n",
       "0  S001  怠涵  女\n",
       "1  S002  婉清  女\n",
       "2  S003  溪榕  女\n",
       "3  S004  漠涓  女\n",
       "4  S005  祈博  女"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 只筛选第二个表的少量的列\n",
    "df_sinfo = df_sinfo[[\"学号\", \"姓名\", \"性别\"]]\n",
    "df_sinfo.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>班级</th>\n",
       "      <th>学号</th>\n",
       "      <th>语文成绩</th>\n",
       "      <th>数学成绩</th>\n",
       "      <th>英语成绩</th>\n",
       "      <th>姓名</th>\n",
       "      <th>性别</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>C01</td>\n",
       "      <td>S001</td>\n",
       "      <td>99</td>\n",
       "      <td>84</td>\n",
       "      <td>88</td>\n",
       "      <td>怠涵</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>C01</td>\n",
       "      <td>S002</td>\n",
       "      <td>66</td>\n",
       "      <td>95</td>\n",
       "      <td>77</td>\n",
       "      <td>婉清</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>C01</td>\n",
       "      <td>S003</td>\n",
       "      <td>68</td>\n",
       "      <td>68</td>\n",
       "      <td>61</td>\n",
       "      <td>溪榕</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>C01</td>\n",
       "      <td>S004</td>\n",
       "      <td>63</td>\n",
       "      <td>66</td>\n",
       "      <td>82</td>\n",
       "      <td>漠涓</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>C01</td>\n",
       "      <td>S005</td>\n",
       "      <td>72</td>\n",
       "      <td>95</td>\n",
       "      <td>94</td>\n",
       "      <td>祈博</td>\n",
       "      <td>女</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    班级    学号  语文成绩  数学成绩  英语成绩  姓名 性别\n",
       "0  C01  S001    99    84    88  怠涵  女\n",
       "1  C01  S002    66    95    77  婉清  女\n",
       "2  C01  S003    68    68    61  溪榕  女\n",
       "3  C01  S004    63    66    82  漠涓  女\n",
       "4  C01  S005    72    95    94  祈博  女"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_merge = pd.merge(left=df_grade, right=df_sinfo, left_on=\"学号\", right_on=\"学号\")\n",
    "df_merge.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 步骤3：调整列的顺序"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['班级', '学号', '语文成绩', '数学成绩', '英语成绩', '姓名', '性别'], dtype='object')"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_merge.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 问题：怎样将'姓名', '性别'两列，放到'学号'的后面？\n",
    "\n",
    "接下来需要用Python的语法实现列表的处理"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['班级', '学号', '语文成绩', '数学成绩', '英语成绩', '姓名', '性别']"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 将columns变成python的列表形式\n",
    "new_columns = df_merge.columns.to_list()\n",
    "new_columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 按逆序insert，会将\"姓名\"，\"性别\"放到\"学号\"的后面\n",
    "for name in [\"姓名\", \"性别\"][::-1]:\n",
    "    new_columns.remove(name)\n",
    "    new_columns.insert(new_columns.index(\"学号\")+1, name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['班级', '学号', '姓名', '性别', '语文成绩', '数学成绩', '英语成绩']"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "new_columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>班级</th>\n",
       "      <th>学号</th>\n",
       "      <th>姓名</th>\n",
       "      <th>性别</th>\n",
       "      <th>语文成绩</th>\n",
       "      <th>数学成绩</th>\n",
       "      <th>英语成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>C01</td>\n",
       "      <td>S001</td>\n",
       "      <td>怠涵</td>\n",
       "      <td>女</td>\n",
       "      <td>99</td>\n",
       "      <td>84</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>C01</td>\n",
       "      <td>S002</td>\n",
       "      <td>婉清</td>\n",
       "      <td>女</td>\n",
       "      <td>66</td>\n",
       "      <td>95</td>\n",
       "      <td>77</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>C01</td>\n",
       "      <td>S003</td>\n",
       "      <td>溪榕</td>\n",
       "      <td>女</td>\n",
       "      <td>68</td>\n",
       "      <td>68</td>\n",
       "      <td>61</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>C01</td>\n",
       "      <td>S004</td>\n",
       "      <td>漠涓</td>\n",
       "      <td>女</td>\n",
       "      <td>63</td>\n",
       "      <td>66</td>\n",
       "      <td>82</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>C01</td>\n",
       "      <td>S005</td>\n",
       "      <td>祈博</td>\n",
       "      <td>女</td>\n",
       "      <td>72</td>\n",
       "      <td>95</td>\n",
       "      <td>94</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    班级    学号  姓名 性别  语文成绩  数学成绩  英语成绩\n",
       "0  C01  S001  怠涵  女    99    84    88\n",
       "1  C01  S002  婉清  女    66    95    77\n",
       "2  C01  S003  溪榕  女    68    68    61\n",
       "3  C01  S004  漠涓  女    63    66    82\n",
       "4  C01  S005  祈博  女    72    95    94"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_merge = df_merge.reindex(columns=new_columns)\n",
    "df_merge.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 步骤4：输出最终的Excel文件"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_merge.to_excel(\"./course_datas/c23_excel_vlookup/合并后的数据表.xlsx\", index=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
